105 research outputs found

Dundee Discussion Papers in Economics 245:Growth rate estimation in the presence of unit roots

Author: Chatterji Monojit
Choudhury Homagni
Publication venue: 'University of Dundee'
Publication date: 01/01/2010

University of Dundee Online Publications

Dundee Discussion Papers in Economics 244:The changing inter-industry wage structure of the organised manufacturing sector in India, 1973-74 to 2003-04

Author: Chatterji Monojit
Choudhury Homagni
Publication venue: 'University of Dundee'
Publication date: 01/01/2010

University of Dundee Online Publications

Multi-Agent Simulation of Emergence of Schwa Deletion Pattern in Hindi

Author: Anupam Basu
Monojit Choudhury
Sudeshna Sarkar

Recently, there has been a revival of interest in multi-agent simulation techniques for exploring the nature of language change. However, a lack of appropriate validation of simulation experiments against real language data often calls into question the general applicability of these methods in modeling realistic language change. We try to address this issue here by making an attempt to model the phenomenon of schwa deletion in Hindi through a multi-agent simulation framework. The pattern of Hindi schwa deletion and its diachronic nature are well studied, not only out of general linguistic inquiry, but also to facilitate Hindi grapheme-to-phoneme conversion, which is a preprocessing step to text-to-speech synthesis. We show that under certain conditions, the schwa deletion pattern observed in modern Hindi emerges in the system from an initial state of no deletion. The simulation framework described in this work can be extended to model other phonological changes as well.

Keywords: Language Change, Linguistic Agent, Language Game, Multi-Agent Simulation, Schwa Deletion

Research Papers in Economics

LoNLI: An Extensible Framework for Testing Diverse Logical Reasoning Capabilities for NLI

Author: Aditya Somak
Choudhury Monojit
Tarunesh Ishan
Publication date: 04/12/2021

Natural Language Inference (NLI) is considered a representative task to test natural language understanding (NLU). In this work, we propose an extensible framework to collectively yet categorically test diverse logical reasoning capabilities required for NLI (and by extension, NLU). Motivated by behavioral testing, we create a semi-synthetic large test-bench (363 templates, 363k examples) and an associated framework that offers the following utilities: 1) individually test and analyze reasoning capabilities along 17 reasoning dimensions (including pragmatic reasoning); 2) design experiments to study cross-capability information content (leave one out or bring one in); and 3) control for artifacts and biases, which the synthetic nature of the data enables. The inherited power of automated test case instantiation from free-form natural language templates (using CheckList), and a well-defined taxonomy of capabilities, enable us to extend to (cognitively) harder test cases while varying the complexity of natural language. Through our analysis of state-of-the-art NLI systems, we observe that our benchmark is indeed hard (and non-trivial even with training on additional resources). Some capabilities stand out as harder. Further fine-grained analysis and fine-tuning experiments reveal more insights about these capabilities and the models, supporting and extending previous observations. Towards the end we also perform a user study to investigate whether behavioral information can be utilised to generalize much better for some models compared to others.

Comment: arXiv admin note: substantial text overlap with arXiv:2107.0722

arXiv.org e-Print Archive

LLM-powered Data Augmentation for Enhanced Cross-lingual Performance

Author: Aji Alham Fikri
Choudhury Monojit
Whitehouse Chenxi
Publication date: 22/10/2023

This paper explores the potential of leveraging Large Language Models (LLMs) for data augmentation in multilingual commonsense reasoning datasets where the available training data is extremely limited. To achieve this, we utilise several LLMs, namely Dolly-v2, StableVicuna, ChatGPT, and GPT-4, to augment three datasets: XCOPA, XWinograd, and XStoryCloze. Subsequently, we evaluate the effectiveness of fine-tuning smaller multilingual models, mBERT and XLMR, using the synthesised data. We compare the performance of training with data generated in English and target languages, as well as translated English-generated data, revealing the overall advantages of incorporating data generated by LLMs, e.g. a notable 13.4 accuracy score improvement for the best case. Furthermore, we conduct a human evaluation by asking native speakers to assess the naturalness and logical coherence of the generated examples across different languages. The results of the evaluation indicate that LLMs such as ChatGPT and GPT-4 excel at producing natural and coherent text in most languages; however, they struggle to generate meaningful text in certain languages like Tamil. We also observe that ChatGPT falls short in generating plausible alternatives compared to the original dataset, whereas examples from GPT-4 exhibit competitive logical consistency.

Comment: EMNLP 2023 Main Conference

arXiv.org e-Print Archive